@pmpailis
Contributor

@pmpailis pmpailis commented Nov 11, 2025

In this PR we add support for column pruning for FORK branches.

The main idea is that we apply pruning in two steps:

  • First, we limit FORK's output to the attributes that are actually needed, in the PruneColumns rule.
  • Then, based on these attributes, we prune each branch of the FORK plan independently through pruneSubPlan, using the same used params as the base (i.e. FORK's output). For each branch we then separately compute any additional params it needs, and keep or remove plans as usual in PruneColumns.
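
To illustrate the two steps above, here is a minimal sketch on a toy model. It assumes columns are plain strings rather than ES|QL attributes, and the names ForkPruneSketch, pruneForkOutput and pruneBranch are hypothetical; this is not the code in this PR, only the shape of the idea.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: columns are plain strings, not ES|QL Attribute objects.
final class ForkPruneSketch {

    // Step 1: keep only the FORK output columns that something downstream uses.
    static List<String> pruneForkOutput(List<String> forkOutput, Set<String> usedDownstream) {
        List<String> kept = new ArrayList<>();
        for (String column : forkOutput) {
            if (usedDownstream.contains(column)) {
                kept.add(column);
            }
        }
        return kept;
    }

    // Step 2: prune a single branch on its own. The base "used" set is the pruned
    // FORK output; columns the branch needs internally are added on top of it.
    static List<String> pruneBranch(
        List<String> branchColumns,
        List<String> prunedForkOutput,
        Set<String> branchInternalUses
    ) {
        Set<String> used = new LinkedHashSet<>(prunedForkOutput);
        used.addAll(branchInternalUses);
        List<String> kept = new ArrayList<>();
        for (String column : branchColumns) {
            if (used.contains(column)) {
                kept.add(column);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> forkOutput = List.of("emp_no", "salary", "languages");
        List<String> pruned = pruneForkOutput(forkOutput, Set.of("emp_no", "salary"));
        System.out.println(pruned); // [emp_no, salary]

        // This branch also needs hire_date internally (an assumed example, e.g. for a filter).
        List<String> branch = List.of("emp_no", "salary", "languages", "hire_date");
        System.out.println(pruneBranch(branch, pruned, Set.of("hire_date"))); // [emp_no, salary, hire_date]
    }
}
```

Each branch is handled independently, so a column that one branch needs internally does not keep the same column alive in the other branches.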

Closes #136365

@pmpailis pmpailis added the :Search Relevance/ES|QL and >bug labels Nov 26, 2025
@elasticsearchmachine
Collaborator

Hi @pmpailis, I've created a changelog YAML for you.

@pmpailis pmpailis force-pushed the fix_136365_prune_columns_when_fork branch from 708291d to 52893e6 Compare November 26, 2025 09:22
@pmpailis pmpailis marked this pull request as ready for review November 26, 2025 10:56
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance label Nov 26, 2025
@pmpailis
Contributor Author

run elasticsearch-ci/part-1

Member

@carlosdelest carlosdelest left a comment

LGTM! Very good testing! 🔝

I have two questions regarding the pruning branches and the need to have a specific AttributeSet for ignoring IDs.

* This is useful for Fork plans, where branches may have different Attribute IDs but share a common output schema,
* allowing equality checks of used attributes based on their names.
*/
public static class ForkBuilder extends Builder {
Member

In case you want to avoid IDs, I think you can reuse Attribute.IdIgnoringWrapper.

See here for an example

Contributor Author

I was looking into that, but I think it introduced too much clutter, as it needed to be bidirectional (and I preferred to keep PruneColumns as untouched as possible).

Will take another look though as I could very well be missing something, thanks!

Contributor Author

The main reason for a new builder was to take advantage of pruneUnusedAndAddReferences, so as to reuse as much code as possible through the existing check in

   if (used.contains(attr)) ...

I'm looking into whether we can change the attributes initially passed to the builder and avoid this new instance.
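
For readers following the thread, a minimal sketch of why a name-based set lets a check like `used.contains(attr)` keep working across branches. Everything here is an assumption for illustration: Attr and NameOnlySet are hypothetical stand-ins, not the ForkBuilder from this PR nor Attribute.IdIgnoringWrapper.

```java
import java.util.HashSet;
import java.util.Set;

final class NameBasedUsedSetSketch {

    // Hypothetical stand-in for an attribute: a name plus a unique ID.
    // As a record, equality covers both fields, so differing IDs mean "not equal".
    record Attr(String name, long id) {}

    // Hypothetical set that answers contains() by name only, ignoring IDs.
    static final class NameOnlySet {
        private final Set<String> names = new HashSet<>();

        void add(Attr attr) {
            names.add(attr.name());
        }

        boolean contains(Attr attr) {
            return names.contains(attr.name());
        }
    }

    public static void main(String[] args) {
        Attr forkOutput = new Attr("salary", 1L);    // as seen in FORK's output
        Attr branchOutput = new Attr("salary", 42L); // same column, different ID in a branch

        Set<Attr> byId = new HashSet<>();
        byId.add(forkOutput);

        NameOnlySet used = new NameOnlySet();
        used.add(forkOutput);

        System.out.println(byId.contains(branchOutput)); // false
        System.out.println(used.contains(branchOutput)); // true
    }
}
```

With the name-only lookup, the caller's `contains` check stays unchanged and only the set implementation differs, which is the kind of reuse described above.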

Member

@carlosdelest carlosdelest left a comment

This LGTM - though I suggest that @ioanatia takes a look at it before merging as she's the FORK expert 👍

Labels

>bug, :Search Relevance/ES|QL, Team:Search Relevance, v9.3.0

Development

Successfully merging this pull request may close these issues.

ES|QL: PruneColumns fails to prune columns when forked

3 participants